-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python wrapper, API #672
Python wrapper, API #672
Conversation
Thanks! I'll get into reviewing this in the coming few days 🙂 |
Update: I got python wheel construction working! here's how to run the integration testsPython development environment setup:
# prepare the binaries
cargo build --release --verbose
cargo build --release --verbose --features "extended"
bin="$PWD/target/release/pagefind"
ext="$PWD/target/release/pagefind_extended"
cd wrappers/python
# set up the python virtual environment
poetry install --no-root # for dev dependencies
export VIRTUAL_ENV="${PWD}/.venv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
# build and install the binary-only wheels
python3 ./build_binary_only_wheel.py \
--llvm-triple="x86_64-unknown-linux-musl" \
--bin-path=$bin \
--version=1.1.0
python3 ./build_binary_only_wheel.py \
--llvm-triple="x86_64-unknown-linux-musl" \
--bin-path=$ext \
--version=1.1.0
poetry build # build the source-only distribution for the python API
# install all the wheels
pip install ./dist/*.whl
pip show --verbose experimental_pagefind_python_bin
pip show --verbose experimental_pagefind_python_bin_extended
pip show --verbose pagefind_python
LOG_LEVEL="DEBUG" python3 ./src/tests/integration.py 2>&1 | tee /tmp/integration_test.log After this, the python code is 90% ready for prime time. If you're OK with the API, all that's left is add python testing and publication to GitHub Actions. |
This all looks good to me so far! :) |
Regarding testing: The main branch has been updated to have much better and more reliable tests, and it would be nice to include at least some of the Python testing in that integration suite. For example, here is one of the Node API integration test files: https://github.com/CloudCannon/pagefind/blob/9a1341642fa3e61891f50603af8d3217652059cd/pagefind/integration_tests/node_api/node_base/build-a-synthetic-index-to-disk-via-the-api.toolproof.yml Since the APIs are pretty comparable, duplicating that |
I'll definitely take you up on the offer of translating the integration tests to python! FYI: I started working on publishing the python API separately in https://github.com/SKalt/pagefind_python in order to keep progress going without pestering you. I'll have to backport some of the work into this repo, which won't be way too much work, but it will introduce my more interesting automation choices here:
If you're cool with those decisions and taking on the work of maintaining the python API here, I'll pull in my work from the other repo. |
Grand! I'll tack that onto this branch (I just pushed a merge in from main to get the new suite) (no more erroneous failing tests 😍)
Looks fine to me 👍
Also all good — I also tend to use cjs scripts for anything complex. This repo already has a .backstage directory for this kind of workflow script 🙂
Yeah let's do it! Hopefully it won't need much faffing with over time since the API is pretty locked in. |
Alright! I'm currently winding down for the day, but I should be able to start the backport tomorrow. I'll @ you when this branch is up-to-date.
The only remaining bikeshed-ish question I can think of is whether to add an entry point to expose a command-line interface that doesn't need to be prefixed with There was a good discussion of the tradeoffs in the
But that decision doesn't really impact shipping this API. |
I think I agree with the logic from Zig there — I think having the It's been a while since I've been in the Python ecosystem — does that flow allow you to run Pagefind without an explicit install? I'd love an alternative fast-path to |
There's |
Hey, this PR should be pretty much ready for your updated tests! As an aside: scripting builds that rely on GitHub actions was a bit painful since I had to keep guessing at what filesystem side-effects each action takes. If you're interested in investing in a build process that can operate fully locally, I have a Makefile I could share. |
02b36be
to
2521585
Compare
Alright, I went through and added the full suite of Node API Toolproof tests for this PR! It served as a good exercise to learn asyncio and get familiar with the Python API before taking on maintenance 🙂 Some changes that came out of it:
Most of those changes are paired with a Toolproof test that triggered the error 🙂 Let me know if you see anything untoward! I'll look at tacking a publish action onto this branch to go to TestPyPi soon. |
Looks great! I've been keeping up with your storm of commits and couldn't find anything to add except code golf opportunities. I'm going to be available for the next 30 minutes, so let me know if there's anything I can do to help with the publication process. |
Excellent! I need to hop off for the afternoon, but will hopefully get a crack at the publishing tonight (NZ). Feel free to tack any golf tweaks on while you’re here 🙂 |
Oh, one last thing: |
Good to know! Wasn't aware 🙂 |
Ticking away at a few test release runs on a non-fork branch. Won't crack it tonight, but shouldn't be too big of a lift! First annoyance seems to be configuring the trusted publisher for each package. I can add a "pending publisher" in PyPi for a single package, but it errors if I add multiple pending packages with the same publisher. From what I can read though there is a many<>many relationship here, so this might only be a limitation while the publisher is pending. I was hoping to avoid setting it up to publish from my machine, so it might just take a few workflow runs back to back to get each successive package out, we'll see. Let me know if you have any ideas there. |
Also, what would be the best way to publish to the |
I did just add you as owner of the test.pypi packages. I think the easiest way to debug the trusted publisher issue is in a separate repo with a bare-bones python package, but I'm not going to get to work on that tonight so |
Hah no worries, have kicked off a test publish of all the packages. Once the projects exist pypi is happy having the same trusted publisher configured for them all — it just seems I can't configure it so that a trusted publisher is allowed to publish multiple new projects from an action. No big deal, I'll do the first publish to real pypi from a branch anyway and publish a |
Latest:
Looks like that should be |
Yep, you're right. You can import |
Awesome!! The only thing I wish I could fix is the asyncio subprocess pipe potentially getting clogged when |
Nice! I actually want to match esbuild's subprocess logic here, which is that messages over a certain size get saved to a temp file by the Pagefind service, and this file is then loaded by the wrapper. I know in their case using the filesystem benchmarked faster than stdio over a certain message size, and also has the benefit for us of preventing anything getting clogged up by keeping messages smaller. So at some point getting that added to the service and both wrappers should solve the issue 😄 Thanks a tonne for all of this, I appreciate it! I also feel like I have caught up on a good decade of progress in the Python ecosystem 😆 Very impressed with pypi for this purpose, many rough edges of the npm wrapper that don't exist here. And thanks for the patience on my side being slow! I'll merge this into main now. I have a few other small issues I'd like to pick off before the next release, but I'll set a hard deadline of two weeks to put the |
New docs can be read in the meantime at: |
That's a clever trick; thanks for telling me about it. Over the last couple of weeks/months the rust/python tool
Hey, it was great working with you! I expect we'll be in touch once I get around to writing a mkdocs plugin for pagefind. |
Cool! I'll take that for a spin
Looking forward to it 😄 |
A little delayed again, but v1.2.0 is out 😄 |
When finished, this closes #634. To most effectively review this PR, skip right to reading the integration test (wrappers/python/src/tests/integration.py).
There are still a number of FIXMEs and TODOs in the code, most of which relate to creating a consistent build of an python wheel for the appropriate OS and arch. There's a viable how-to at https://simonwillison.net/2022/May/23/bundling-binary-tools-in-python-wheels/, so stay tuned.